# 128k ultra-long context

Qwen3 128k 30B A3B NEO MAX Imatrix GGUF

A GGUF quantized version of the Qwen3-30B-A3B Mixture-of-Experts model, extended to 128k context and quantized with the NEO Imatrix method, supporting multilingual and multitask processing.

Large Language Model, Supports Multiple Languages

Qwen3 32B 128k NEO Imatrix Max GGUF

This is the NEO Imatrix quantized version of the Qwen3-32B model. It keeps the output tensor in BF16 format to improve inference and generation quality, and supports a 128k context length.

Large Language Model

Qwen3 32B 128k HORROR Imatrix Max GGUF

A horror-themed text generation model based on Qwen3-32B, quantized with Imatrix technology for improved reasoning, supporting a 128k ultra-long context.

Large Language Model

Mistral Small 3.1 24B Instruct 2503 MAX NEO Imatrix GGUF

A 24B-parameter instruction-tuned model from Mistral AI, supporting a 128k context length and multilingual processing, enhanced with NEO Imatrix technology and the MAX quantization scheme.

Large Language Model, Supports Multiple Languages

Gemma 3 12b It MAX HORROR Imatrix GGUF

A horror-style instruction-tuned model based on Google's Gemma-3, featuring NEO Imatrix technology and extreme quantization, supporting a 128k context length.

Large Language Model

Llama 3.3 70B Instruct AWQ

Llama 3.3 is a multilingual large language model developed by Meta, with 70 billion parameters. It is optimized for multilingual dialogue use cases and performs strongly on multiple benchmarks.

Large Language Model

Transformers, Supports Multiple Languages

Llama 3.2 3B Instruct NEO SI FI GGUF

A 3B-parameter instruction-tuned model based on the Llama-3.2 architecture, incorporating the NEO IMATRIX sci-fi dataset and supporting 128k long-context generation.

Large Language Model, Supports Multiple Languages

Llama 3.1 405B FP8

Meta Llama 3.1 is a collection of multilingual large language models, comprising pre-trained and instruction-tuned generative models with 8B, 70B, and 405B parameters. It supports 8 languages and performs strongly on industry benchmarks.

Large Language Model

Transformers, Supports Multiple Languages

Llama 3.1 405B Instruct FP8

Meta Llama 3.1 is a multilingual large language model series, including pre-trained and instruction-tuned generative models at 8B, 70B, and 405B scales. The 405B version is optimized for multilingual dialogue scenarios and performs well on common industry benchmarks.

Large Language Model

Transformers, Supports Multiple Languages

Llama 3.1 70B Instruct

Meta Llama 3.1 is a set of pre-trained and instruction-tuned generative models with 8B, 70B, and 405B parameters, optimized for multilingual conversation scenarios and supporting 8 languages as well as code generation.

Large Language Model

Transformers, Supports Multiple Languages

Llama 3.1 is a multilingual large language model series released by Meta, available in 8B, 70B, and 405B sizes, supporting 8 languages and performing strongly on industry benchmarks.

Large Language Model

Transformers, Supports Multiple Languages

Meta Llama 3.1 is a large language model series supporting 8 languages, available at 8B, 70B, and 405B scales, and outperforming most open-source and proprietary chat models on industry benchmarks.

Large Language Model

Transformers, Supports Multiple Languages

© 2025 AIbase